AD618077 




800  CATHEDRAL  OF  LEARNING  •  PITTSBURGH,  PENNSYLVANIA  15213 


University of Pittsburgh
Generalized Recording And Dissemination Experiment


A.R.P.A. 

Advanced  Research  Projects  Agency 


Conducted  by 


UNIVERSITY  OF  PITTSBURGH 
COMPUTATION AND DATA PROCESSING CENTER
PITTSBURGH, PENNSYLVANIA 15213


Progress  in  Computerized  Typesetting 


Lee  Ohringer 


A  reprint  of  a  paper  presented  before 
The  17th  Annual  Meeting  of  the 
Technical  Association  of  the  Graphic  Arts 
Park  Plaza,  Toronto,  Ontario,  Canada 
June 1, 1965


Progress  in  Computerized  Typesetting 


Lee  Ohringer 

Project  UPGRADE  Coordinator 


University  of  Pittsburgh 
Computation  and  Data  Processing  Center 
800  Cathedral  of  Learning 
Pittsburgh,  Pennsylvania  15213 


Abstract 


Recent years have seen the introduction of digital computers
into the printing industry. Thus far computers have been
used primarily in accounting departments and for limited
typesetting, for example in newspaper work.

At the University of Pittsburgh we are studying new ways
in which the computer can be used to aid the printer. We
are experimenting with advanced concepts such as
computerized editing routines and typesetting of complex
material. The programs which we have written for our
computer enable the editor to see his changes take effect
immediately on the text, which appears on a display screen
electronically controlled by our computer.

We  have  also  developed  computerized  indexing  methods 
and  have  used  our  computer  to  generate  a  dictionary  of 
current  scientific  terms  from  the  text  which  we  have 
collected. 

Presently  two  IBM  1401's,  an  IBM  7070  and  7090,  and  a 
PDP-4  computer  are  available  for  our  research  with  our 
Photon  560. 

This  project  receives  partial  support  from  the  Department 
of  Defense  Advanced  Research  Projects  Agency  under 
contract  SD-186  and  National  Science  Foundation  Grants 
GP  2310  and  G  11309. 


INTRODUCTION 


This paper covers four areas of research currently
being investigated at the University of Pittsburgh
Computation and Data Processing Center. The first of
these is a project to collect large amounts of text in
computer-compatible form. The second is a user-oriented
computer language which we designed specifically to
simplify research on this and other text. Computerized
typesetting is my third subject, and editing, formatting,
and incorporating author's alterations using computers is
the final topic which I will discuss in this paper.




All of our efforts in this field fall under Project
UPGRADE, which stands for the University of Pittsburgh
Generalized Recording and Dissemination Experiment.


TEXT  COLLECTION 

It was our text collection project which gave us our
initial contact with the printing industry. Since our
Computing Center has been primarily devoted to developing
methods of text handling and information processing, as
contrasted with the mathematical methods research done at
most computing centers, it was natural that we were the
ones requested by the Department of Defense to develop
a means to obtain large amounts of text in
computer-readable form.

Toward  this  goal,  we  examined  methods  used  by  past 
projects  such  as  punching  the  text  onto  tab  cards  or  paper 
tape.  We  also  considered  optical  character  readers  which 
were  then  being  proposed.  None  of  these  methods 
appeared  to  be  capable  of  meeting  our  needs. 

It was then that we turned to the printing industry in
search of an answer. We found that many printers were
sincerely interested in what we were doing and quite
willing to help in any way they could. A pilot study was set
up whereby we received the typesetting tapes from
Lancaster Press and from a job that Kingsport Press was
doing for McGraw-Hill. Since that time the list of printers,
publishers, and research centers from whom we have
received advice and co-operation has grown so that today
over fifty have contributed significantly to our efforts.

The following slide shows many of those to whom we owe
credit for much of our success.


[illegible] Press, Inc.
American Chemical Society
American Mathematical Society
American Meteorological Society
American Newspaper Publishers Assoc.
American Printing House for the Blind
Association of American University Presses
Center for Applied Linguistics
Colonial Press Inc.
Composition Information Services
Computer Typesetting Inc.
Curtis Publishing Co.
Duquesne University Press
Federation of American Societies for Biology
Graphic Arts Technical Foundation
Harris-Intertype Corporation
Harvard University
Herbick and Held
Indiana State College
Inforonics
International Business Machines Corporation
International Typographic Composition Assoc. Inc.
Itek Corporation
John Wiley and Sons Inc.
Johns Hopkins University
Kingsport Press Inc.
Lancaster Press
Lanston Monotype Co.
Los Angeles Times
Louisiana State University Press
Mack Printing Co.
Massachusetts Institute of Technology
McGraw-Hill Book Company Inc.
Mergenthaler Linotype Co.
National Institutes of Health
Photon Inc.
Printing Production Magazine
RAND Corporation
Random House Incorporated
Research and Engineering Council of the Graphic Arts
Rocappi
Summer Institute of Linguistics
System Development Corporation
Thomas Nelson and Sons
U. S. Government Printing Office
University of Alabama Press
University of North Carolina
University of Wisconsin Press
Vail-Ballou Press Inc.
W. A. Benjamin, Inc.
W. B. Saunders Co.
Waverly Press
Williams and Wilkins Publishers


We are, of course, still interested in making further
arrangements to receive the typesetting tapes from other
printers. Of particular interest and use to us are the
tapes from books by renowned authors, poetry collections,
biographies, and versions of the various Bibles. We use
the text from these tapes solely for research in
computerized text processing such as automatic indexing,
abstracting, and classification, and never in a way not
approved by the printer who supplied it.

One obvious benefit to the printer from our text
collection will be the printing needs created for
publication of the research done by us and others on the
text which we accumulate.


[Slide: a page of a descending word-frequency list. Each
entry gives an occurrence count followed by a word, for
example 421 METABOLIC, 402 CELLULOSE, 402 ALLOY. Most of
the remaining entries are illegible in this copy.]


This slide shows such an example from a page of the
descending frequency list from the McGraw-Hill
Encyclopedia of Science and Technology. This listing was
created completely automatically by our computer from
the text which we collected from the Teletypesetter tapes
used to do the printing. Also, we can make statistics
available to the printer from the tapes he supplies. Such
statistics could be useful in helping him design a new
matrix arrangement, for example.
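
A descending frequency list of this kind can be sketched in a
few lines of modern Python. This is only an illustration and
not the program used on the project: the function name
descending_frequency, the rule that a word is any run of
letters, and the alphabetical ordering of ties are all
assumptions made here.

```python
import re
from collections import Counter

def descending_frequency(text):
    """Return (word, count) pairs, most frequent first.

    Assumptions (not from the paper): a word is any run of
    letters, case is ignored, and ties are listed alphabetically.
    """
    words = re.findall(r"[A-Za-z]+", text.upper())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

if __name__ == "__main__":
    sample = "the alloy and the cellulose and the strain"
    # Print in the "count WORD" layout of the slide.
    for word, count in descending_frequency(sample):
        print(f"{count:4d} {word}")
```

The same counts, taken over characters instead of words, would
give the kind of statistics a printer could consult when
designing a matrix arrangement.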

PENELOPE  (Pitt  Natural  Language  Processor) 

PENELOPE, the Pitt Natural Language Processor,
was designed to satisfy the need for a computer language
capable of processing text efficiently and easily.
PENELOPE was designed specifically to allow the
programmer to write his program in a way which would be
natural to him. PENELOPE then translates his statements
into code which can be understood and executed by a
computer. Examples of PENELOPE's capabilities are
shown and explained in a paper which I presented at last
year's TAGA meeting in Pittsburgh. That paper appeared
in its complete form in the 1964 TAGA Proceedings;
therefore I will not go into detail here.

The  translator  for  PENELOPE  has  been  completed 
and  is  in  use  on  the  IBM  7070  at  the  University  of  Pittsburgh. 
Copies  of  this  program  are  available,  free,  upon  request, 
as  are  most  of  the  routines  developed  by  our  Center.  A 
technical  write-up  is  also  available  upon  request. 

COMPUTERIZED  TYPESETTING 

Our progress in computerized typesetting, since my
talk at last year's meeting, involves our advancing from a
theoretical approach to actual production. Last year I
spoke of what could be done if we had a piece of
phototypesetting equipment. This year I will tell you what
we have done with the Photon 560 which we have since
acquired and what we are planning to do.

For the justification part of our system we are using
a modified version of the PC6 system which was originally
conceived by Dr. Michael P. Barnett. One feature of the
original system which we hoped to improve was the great
number of keystrokes required to insert the printing
control information such as type size and type font.
We feel we have found a means of reducing them, as we
demonstrated when we prepared the control tapes for a
bibliography for learning research as shown on my next
slide.


(As keyboarded)

1 108. Traxler, Arthur E.
2 "One Reading Test Serves the Purpose,"
3 [The Clearing House,] XIV (March, 1940), 419-21.
4 Presents correlations (a) between scores on different
forms of four reading tests administered a
year apart and (b) between different reading tests
administered at intervals of one year.

(After automatic insertion of control codes)

(leading control codes illegible) Traxler, Arthur E.
[DL2]"One Reading Test Serves the Purpose,"
[HL4][DL16]The Clearing House,[BL4] XIV (March, 1940), 419-21.
[DL2]Presents correlations (a) between scores on
different forms of four reading tests administered
a year apart and (b) between different reading
tests administered at intervals of one year.

In punching these tapes the only signals to the computer
which the keyboarder inserted were the code numbers
1, 2, 3, 4 in the left hand margin and brackets around any
text which was to be in italics. With a simple
preprocessing computer program we then expanded these into
the appropriate codes, thereby eliminating many
keystrokes. This slide shows one of the entries from this
bibliography. The top of the slide shows how it appeared
as originally keyboarded and below is shown how it looked
after the control codes were automatically inserted.
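
The expansion step can be sketched as follows. This is a
minimal illustration in modern Python, not the preprocessing
program itself. The function name expand and the table
LINE_CODES are hypothetical, and the control codes used are
stand-ins that only borrow the look of those on the slide;
which code belongs to which margin number is an assumption.

```python
import re

# Hypothetical table: margin code -> control codes that open the line.
LINE_CODES = {"1": "[DL5]", "2": "[DL2]", "3": "", "4": "[DL2]"}
# Hypothetical codes that switch italics on and off.
ITALIC_ON, ITALIC_OFF = "[HL4][DL16]", "[BL4]"

def expand(line):
    """Expand a margin code and italic brackets into control codes."""
    match = re.match(r"\s*([1-4])\s+(.*)", line)
    if match:
        prefix, text = LINE_CODES[match.group(1)], match.group(2)
    else:
        # A line with no margin code continues the previous line.
        prefix, text = "", line
    # Replace each bracketed span in a single pass so that the
    # brackets inside the inserted codes are not expanded again.
    text = re.sub(r"\[([^\]]*)\]",
                  lambda m: ITALIC_ON + m.group(1) + ITALIC_OFF,
                  text)
    return prefix + text
```

One limitation of this sketch is that a continuation line which
happens to begin with a digit from 1 to 4 followed by a space
would be mistaken for a coded line; a real keyboarding
convention would need to rule that out.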

Another feature which I indicated that we were going
to add to our computer-typesetting system was the
hyphenation capability. Currently, we have a member of our
staff working on such a routine and hope to have it
finished by the end of the summer. However, even after our
hyphenation routine is completed our computer will first
try to justify each line by word spacing and letter
spacing, as we have been doing, in order to save computer
time. In an effort to maintain graphic arts quality we
have set upper and lower limits on such spacing.
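
Justification by spacing within limits can be sketched as
below. This is an illustration only and not the modified PC6
routine: the function name justify, the fixed character width,
and the numerical limits are all assumed values. The sketch
widens the word spaces up to an upper limit, puts any remainder
into letter spacing, and reports failure when the limits cannot
be met, which is the case a hyphenation routine would handle.

```python
def justify(words, measure, char_width=1.0,
            min_space=0.5, max_space=2.0, max_letter=0.2):
    """Return (word_space, letter_space) that fills the measure,
    or None if the spacing limits cannot be met.

    All widths and limits are hypothetical units chosen for the
    example; every character is assumed to have the same width.
    """
    n_chars = sum(len(w) for w in words)
    gaps = len(words) - 1
    if gaps < 1:
        return None
    word_space = (measure - n_chars * char_width) / gaps
    if word_space < min_space:
        return None            # line too tight: a word must move down
    if word_space <= max_space:
        return (word_space, 0.0)
    # Word spaces alone would be too wide: hold them at the upper
    # limit and spread the excess between the letters of each word.
    letter_gaps = n_chars - len(words)
    if letter_gaps < 1:
        return None
    letter_space = (word_space - max_space) * gaps / letter_gaps
    if letter_space > max_letter:
        return None            # too loose: hyphenation would be needed
    return (max_space, letter_space)
```

For example, justify(["aa", "bb"], 5.0) fills the line with word
spacing alone, while a much wider measure returns None.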

EDITING, FORMATTING AND AUTHOR'S ALTERATIONS

It is in the area of man-machine editing that I feel we
have made our most significant progress. We have
written and are currently using a general purpose text
editing and formatting routine. This program is written
for a small scale computer (the PDP-4) which is connected
to our 7090 on an interrupt basis. Text can be accepted
from cards, magnetic tape, or the various kinds of
paper tape. The text is then displayed on a cathode ray
tube screen, and the operator is able to make the changes
he desires by use of a light pen and a typewriter keyboard.

The  operator  can  use  the  light  pen  to  indicate  which 
of  several  editing  functions  he  wishes  to  perform.  He  does 
this  simply  by  pointing  his  light  pen  at  the  desired  function 
which  appears  at  the  bottom  of  the  screen.  A  picture  of 
the  screen  containing  these  codes  is  shown  on  my  next 
slide. 


[Slide: the display screen during the MOV function. The
sample text reads "The words poorly placed are rearranged
for better style." The light buttons appear below it:]

RMT  TYP  FAS
WMT  TYH  SLO
WTM  DEL  REV
RWD  MOV  RUN
SBC  SPG  HLT
IN   MAN
OUT  CD
BIG  RD
DMP  LD
DMT  CLR  FWD

Currently the editing program provides the following
functions:

(1)  RMT  (Read  Magnetic  Tape)  -  Read  input  or 
corrections  from  magnetic  tape. 

(2) WMT (Write Magnetic Tape) - Copy the text which
is currently on the screen onto magnetic tape.
(Does not alter what is on screen.)

(3)  DMT  (Dump  Magnetic  Tape)  -  Write  the  text  which 
is  on  the  screen  onto  magnetic  tape  and  clear  the 
screen. 

(4)  WTM  (Write  Tape  Mark)  -  End  of  current  job. 

(5) RWD (Rewind Magnetic Tapes) - Go to the beginning
of the magnetic tapes.

(6)  SBC  (Switch  B  and  C)  -  Interchange  the  input  and 
output  tapes  to  allow  the  user  to  read  back  what 
he  has  just  written. 


(7) TYP (TYPe) - This will produce on the typewriter
a hard copy of the contents of the screen.

(8)  TYH  (TYpe  Halt)  -  This  command  will  stop  the 
typing. 

(9)  CLR  (CLeaR)  -  Erase  the  text  from  the  screen. 

(10)  DEL  (DELete)  -  Erase  a  specified  part  of  the  text. 

(11)  MOV  (MOVe)  -  Move  a  specified  part  of  the  text  to 
another  specific  point. 

(12) SPG (Special Pattern Generator) - This control
allows the user to change the character set being
used.

(13)  IN  (IN)  -  Read  paper  tape,  display  text  on  screen. 

(14)  OUT  (OUT)  -  Punch  paper  tape  containing  text 
from  the  screen  but  leave  text  on  screen. 

(15)  DMP  (DuMP)  -  Punch  paper  tape  containing  the 
text  from  the  screen  and  clear  the  screen. 

(16)  BIG  (BIG)  -  Punch  paper  tape  so  that  the  holes 
form  the  shapes  of  the  letters  on  the  screen. 

(17) RUN (RUN) - This light button will cause the text
to move up the screen with the first line disappearing
off the top and additional text appearing along the
bottom.

(18) FAS (FASt) - This will cause the text to move
faster (See RUN).

(19) SLO (SLOw) - This will cause the text to move
slower.

(20) FWD (ForWarD) - This will cause the text to move
up the screen and is used to cancel the effect of
the REV command.

(21) REV (REVerse) - This will cause the text to
back up with the top lines reappearing and the
bottom lines disappearing.

(22)  HLT  (HaLT)  -  This  will  stop  the  text  from  moving. 

(23) MAN (MANual) - This command will move the
text one line at a time in the same way as RUN.

As I have indicated, some commands such as MOV work
only with a specified portion of the text. The last three
light buttons allow pointers to be placed in the text to
specify what is to be moved.

(24) LD (Left Delimiter) - will allow placement of the
left pointer.

(25) RD (Right Delimiter) - will allow placement of
the right pointer, and

(26) CD (Cursor Defined) - will allow placement of an
additional pointer to indicate to where the text
is to be moved.
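
The way the three pointers cooperate with DEL and MOV can be
sketched as below. This is a hypothetical model in modern
Python, not the PDP-4 routine: the class EditBuffer, its method
names, and the choice to treat pointers as character positions
in a single string are assumptions made for illustration.

```python
class EditBuffer:
    """Text on the screen plus the LD, RD and CD pointers.

    Pointers are character positions; the marked text runs from
    ld up to, but not including, rd. (Assumed model.)
    """

    def __init__(self, text):
        self.text = text
        self.ld = self.rd = self.cd = None

    def _check_marks(self):
        if self.ld is None or self.rd is None or self.ld > self.rd:
            raise ValueError("place LD and RD before using DEL or MOV")

    def delete(self):
        """DEL: erase the text between the left and right pointers."""
        self._check_marks()
        self.text = self.text[:self.ld] + self.text[self.rd:]
        self.ld = self.rd = None

    def move(self):
        """MOV: move the marked text to the position given by CD."""
        self._check_marks()
        if self.cd is None or self.ld < self.cd < self.rd:
            raise ValueError("CD must lie outside the marked text")
        segment = self.text[self.ld:self.rd]
        rest = self.text[:self.ld] + self.text[self.rd:]
        # Removing the segment shifts every later position left.
        target = self.cd if self.cd <= self.ld else self.cd - len(segment)
        self.text = rest[:target] + segment + rest[target:]
        self.ld = self.rd = self.cd = None

if __name__ == "__main__":
    buf = EditBuffer("The poorly placed words are rearranged.")
    buf.ld = buf.text.index("words")
    buf.rd = buf.ld + len("words ")
    buf.cd = buf.text.index("poorly")
    buf.move()
    print(buf.text)
```

Because the commands are defined only in the program, a further
operation such as duplicating the marked text would amount to
one more short method on the same buffer.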

Presently all of these commands are built in only
through programming and are not part of the hardware.
This allows us a great amount of flexibility in making
modifications and additions. For example, one addition
which is currently being considered is the COPY command,
which will allow the operator to duplicate some portion of
the text on the screen. Another alteration which we are
considering is to divide the screen in half, by programming
of course, in order to be able to accept and output text
from two independent sources. Then the main text could
be read into the top half of the screen and insertions could
be read into the bottom. The operator could combine
them as he wishes.

As a testimonial to the usability of these routines,
several of the secretaries on our staff, with absolutely
no computer training, have used this routine in typing
papers in order to allow for ease of "author alterations."
In fact, the preliminary drafts of the paper I have just
presented were prepared using this system.


Bibliography 


Bacon, Charles R. T., Text Editor 2, Technical Report,
University of Pittsburgh Computation and Data
Processing Center, Pittsburgh, Pennsylvania,
April 8, 1965.

Barnett, M. P., and D. A. Luce, The TYPRINT System,
Cooperative Computing Laboratory, Massachusetts
Institute of Technology, Cambridge, Massachusetts.
Reprinted May 3, 1965 by the University of
Pittsburgh Computation and Data Processing
Center.

Isner, Dale W., PENELOPE, The Pitt Natural Language
Processor, University of Pittsburgh Computation
and Data Processing Center, Pittsburgh,
Pennsylvania, April, 1965.

Ohringer, Lee, "Computer Input from Printing Control
Tapes," TAGA Proceedings 1964, pp. 304-316,
Rochester, New York.


